SEUGI PARIS
June 2002
INTRODUCING
Turin Olympic Winter Games 2006:
a marketing survey of local government initiatives, conducted with SAS tools.
Flavio Bonifacio, Veronica Baldisserri, Metis sas
1. Foreword
As the Olympic Winter Games in 2006 will be held in Turin, the City Administration has contacted the
University and METIS with a view to monitoring public opinion among the inhabitants of Turin regarding
the municipality's initiatives in preparation for the event.
This CATI survey is interesting because it
allows an overview of some modern instruments of marketing research: from the preparation of a
questionnaire and the monitoring of telephone calls and interviews to mining and scoring techniques.
Section 2 describes the questions the survey addresses, section 3 the process of drawing
up the questionnaire, section 4 the models and section 5 some of the mining methods used. Sections 6 and 7
introduce the distinction between centralized (SAS/AF™) and decentralized CATI survey methods
(realised by different techniques: SAS/IntrNet™, SAS/WebAF™ and Form Mailer™), and finally section 8
gives an example of the mining process using EM 4.0™.
We had hoped to summarize the results on this occasion (see abstract), but the deadline for the survey
has not yet been established. We expect the survey to be carried out in the second part of June.
2. The questions in the survey
Everyone can imagine the amount of organization needed to stage the Olympic Games.
Although this is not the first time that Turin has hosted an international event (International Students Games,
Football World Cup, International Book Fair, International Automotive Fair), on this occasion the effort will
be considerable.
Among other things, the administration has to build a great deal of infrastructure from scratch or renew what exists.
This includes buildings and a new auxiliary transport system (the underground, for example).
All this will surely have a considerable impact on the daily life of the inhabitants of the region.
How will they accept the troubles caused by the works in progress? To what extent do they consider it
reasonable to accept the troubles? What sort of returns do they expect from all this? Are the Olympic
Games really perceived as an advantage for the region? Or, first of all, do the inhabitants perceive the
Olympic Games as an advantage for themselves? What price are they prepared to pay, or to what extent are
they prepared to cooperate for the success of the event?
And, on a different analytical level, which kinds of sporting events will attract them most?
These and other similar questions are the object of a survey that will help the City Administration
to "sell" (or, better, to understand how to "sell") to the inhabitants of the region all the troubles
that hosting the Olympic Games will cause, and at the same time to obtain their cooperation.
In other, more technical words, we are trying to estimate a list of scores that classify the
inhabitants of the Turin region with regard to their attitude towards the advantages and disadvantages
resulting from the Olympic Games, in relation to other, more stable characteristics
(e.g. biographic and behavioural characteristics).
3. Editing the questionnaire
It is well known that the questions in a questionnaire must be unambiguously linked to the concepts
they measure. However, a concept often cannot be expressed in a single sentence, and we often need to repeat
questions in different words to avoid misunderstandings and ensure the correct evaluation of the answers.
This may cause redundancy and imprecision. In other words, exact wording is needed on the one hand, while on
the other hand it brings a redundant multiplication of items and more "noise bits".
For example, take the concepts of "troubles" and "advantage" contained in the questions quoted above:
"...How will they accept the troubles caused by the works in progress?..." and "...do the inhabitants perceive
the Olympic Games as an advantage for themselves?"
The first concept, "troubles", is translated into just one direct question, which is considered sufficient
to get a reliable answer:
Concept: Trouble.
Some people say that the Olympic Games will generate troubles and problems.
How often do you think the following situations will create problems for you?
A. Public works
- Very often
- Fairly often
- Occasionally
- Almost never
B. Traffic, parking
- Very often
- Fairly often
- Occasionally
- Almost never
C. Excessive outlays/expenditure
- Very often
- Fairly often
- Occasionally
- Almost never
D. Expensive sports facilities that may become useless
- Very often
- Fairly often
- Occasionally
- Almost never
E. Confusion, crowding, queues
- Very often
- Fairly often
- Occasionally
- Almost never
F. Environmental problems, pollution
- Very often
- Fairly often
- Occasionally
- Almost never
The second concept "advantage" requires more questions.
The concept of advantage is in fact more ambiguous. What is meant by it? A personal advantage,
or an advantage for the municipality or the region? We intend to ask questions on both aspects.
Concept advantage, first meaning:
Do you think you will be directly involved in any of the works for the 2006 Turin
Olympic Winter Games?
Do you think the 2006 Turin Olympic Winter Games will bring you any advantages or
disadvantages?
- Advantages
- Disadvantages
- Neither advantages nor disadvantages
Concept advantage, second meaning:
What sort of impact do you think the 2006 Turin Olympic Winter Games will have on the
region where you live?
- Extremely important
- Very important
- Fairly important
- Not too important
- Not at all important
Generally speaking, do you think that, from an economic point of view, the advantages of hosting
the Olympic Games will be higher than the costs?
- Yes
- No
- No appreciable difference
Finally we may introduce a generic question to elicit an overall opinion about the location of the
Games:
Do you agree or disagree with the choice of Turin for the location of the 2006
Olympic Winter Games?
- Agree
- Disagree
- Don't know
These and other similar questions will make up the questionnaire.
As stated earlier (see section 1), our goal is to estimate a model that gives the probability of people
agreeing with the public works that the Administration will undertake in order to ensure the proper
running of the 2006 Olympic Winter Games. In the next sections of this paper we will discuss some
instances of the model, the statistical methods and some particular tools used to translate questions into
forms for a Computer Aided Telephone Interview (CATI).
In particular we will consider three different SAS approaches
(SAS/AF™, SAS/IntrNet™ and SAS/WebAF™) and two different survey management methods:
telephone and internet.
4. The models
The logit model we will use to predict probabilities has the dichotomous dependent (or target) variable
high_trouble. The target value of high_trouble is 1 when the respondent very often expects the Olympic Games
to generate problems.
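In standard notation, the logit model can be written as below. The symbols are introduced here only for illustration: x_1, ..., x_k stand for the input variables and beta_0, ..., beta_k for the coefficients to be estimated.

```latex
P(\mathit{high\_trouble} = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + \exp\bigl(-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)\bigr)}
```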
This variable may be computed via a factor analysis of the items listed earlier (see concept "Trouble") and then
classified into two categories, high and low trouble, depending for example on the frequency distribution.
Alternatively we may use the factor values directly
in a more traditional linear regression.
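The dichotomisation step can be sketched as follows. This is a simplified stand-in, not the procedure actually used: a plain sum score replaces the factor score, the numeric coding of the answers and the median cut-off are our assumptions, and the respondents are made up for illustration.

```python
from statistics import median

# Assumed numeric coding of the four answer categories (not from the survey).
CODES = {"Very often": 4, "Fairly often": 3, "Occasionally": 2, "Almost never": 1}

def trouble_score(answers):
    """Sum the coded answers to items A-F of the 'Trouble' question."""
    return sum(CODES[a] for a in answers)

def dichotomise(scores):
    """Return 1 for scores strictly above the sample median, else 0."""
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]

# Made-up respondents, for illustration only.
respondents = [
    ["Very often"] * 6,
    ["Occasionally"] * 6,
    ["Almost never"] * 6,
    ["Fairly often"] * 3 + ["Occasionally"] * 3,
]
scores = [trouble_score(r) for r in respondents]
high_trouble = dichotomise(scores)
```

With real data the sum score would be replaced by the first factor score, and the cut-off would be chosen after inspecting the frequency distribution.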
The independent (or input) variables are biographic (age, gender, employment, place of residence, level of
education), attitudinal (some generic information about the Olympic Games, some
specific information about the Turin Olympic Games, for example the locations of the various events,
some information about public events or actions, ...) and behavioural (how often respondents attend sports
events in public stadiums or gyms, how often they take part in sports events or how often they simply watch
sports on TV).
In this case we try to discover which elements influence respondents' judgement of troubles in order to address the
correct information to the right people.
Another interesting model has a type of behaviour as the dependent variable (for example participation in a
specific sports event) and tries to forecast what sort of people follow the various sports events and how many
they are.
In both cases we will produce a list of probabilities ranging from 0 to 1 depending on the state of the
other variables in the model. So we may expect, in the former case, that middle-aged people will have more
information and more confidence in the Administration's actions, and will therefore show a low probability of indicating
problems arising from the organization of the Olympic Games.
At the least, they will recognise that the advantages of the organization will be higher than the costs.
We may also test the hypothesis that people living in the villages involved will judge the Olympic Games
in totally opposite ways: they will totally agree or totally disagree. Finally we may expect that
high employment and high educational levels will go with a low level of trouble.
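To show how such a list of probabilities is produced once a logit model has been fitted, here is a minimal scoring sketch. The coefficients and the variable names (middle_aged, high_education) are hypothetical, chosen only so that the signs match the expectations stated above; they are not estimates from the survey.

```python
import math

# Hypothetical coefficients, for illustration only; NOT survey estimates.
COEFFS = {"intercept": 0.5, "middle_aged": -0.8, "high_education": -0.6}

def p_high_trouble(person):
    """Logistic score: 1 / (1 + exp(-(b0 + sum of b_i * x_i)))."""
    z = COEFFS["intercept"] + sum(
        COEFFS[name] * value for name, value in person.items()
    )
    return 1.0 / (1.0 + math.exp(-z))

# Two made-up respondents described by 0/1 indicator variables.
people = {
    "r1": {"middle_aged": 1, "high_education": 1},
    "r2": {"middle_aged": 0, "high_education": 0},
}
scores = {rid: p_high_trouble(x) for rid, x in people.items()}

# Respondent ids ordered from highest to lowest probability of high trouble.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Sorting respondents by score is what allows the right information to be addressed to the right people: those at the top of the list are the ones most likely to report troubles.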
5. Statistical mining methods
Earlier we mentioned a statistical mining technique, factor analysis, which will be used to build
the variable high_trouble. Factor analysis will also be used more extensively. For example, as shown
earlier, we will use different kinds of questions to test the personal and public advantages of the Olympic
Games.
If the questions reported above are appropriate, and there really is a difference between the answers given to
the two kinds of questions by the persons interviewed, a factor analysis will extract two different factors:
one related to personal advantages, the other related to public advantages. In this way we will have a more
convenient and reliable way to support our hypotheses.
We may then use a more sophisticated discriminant model to classify those persons who do not at the moment
know whether the organization of the Olympic Games will be a positive factor for Turin. In this way we may
identify in advance the portion of the silent population that will probably change its opinion and become
more favourable in the near future.
We have already described the logistic (linear) model that we intend to use. The model building procedures
are part of the mining methods too.
6. Techniques: using SAS to ask questions. Centralized CATI model: SAS/AF™.
At SEUGI 18 we presented a method to collect data from CATI interviews by using SAS. This method consists
of a SAS/AF™ catalogue containing forms where questions and answer items are presented.
This method performs very well when used inside the organization conducting the survey. In this case all
interviewers will be working in the same building and a local area network is sufficient to link the
operator sites to the central database.
This method does not need a specific client/server architecture (the user will set up an ordinary PC as a
server) and only one installation of SAS/Base is needed to run the system. The forms appear as normal
SAS/AF views among which the operator moves depending on the question flow.
At this point we may recall the main tasks performed by this software.
The software allows us to:
- Write the questions in such a way that they can be displayed on the screen
- Read the questions of the survey and save the answers
- Control the flow among the questions
- Manage an address book in order to get telephone numbers, save the results obtained from calls,
register appointments, and so on
- Summarise elementary management statistics by counting completed, interrupted or refused calls
- Do some accounting tasks (cost of calls)
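Two of the tasks listed above, controlling the flow among the questions and counting call outcomes, can be sketched in a few lines. The sketch is written in Python purely for illustration and is not the SAS/AF™ implementation; the two-question script, the routing rule and all names are hypothetical, with question texts abbreviated from section 3.

```python
from collections import Counter

# Hypothetical two-question script; "next" maps each valid answer to the
# id of the next question, and None ends the interview.
QUESTIONS = {
    "q1": {"text": "Do you agree with the choice of Turin?",
           "next": {"Agree": "q2", "Disagree": "q2", "Don't know": None}},
    "q2": {"text": "Will the Games bring you advantages or disadvantages?",
           "next": {"Advantages": None, "Disadvantages": None,
                    "Neither advantages nor disadvantages": None}},
}

def run_interview(answers, start="q1"):
    """Walk the question flow over scripted answers and return what was saved."""
    saved, qid = {}, start
    answers = iter(answers)
    while qid is not None:
        answer = next(answers)
        if answer not in QUESTIONS[qid]["next"]:
            raise ValueError(f"invalid answer for {qid}: {answer!r}")
        saved[qid] = answer
        qid = QUESTIONS[qid]["next"][answer]
    return saved

def call_statistics(outcomes):
    """Count calls by outcome label (e.g. complete, interrupted, refused)."""
    return Counter(outcomes)
```

For example, run_interview(["Agree", "Advantages"]) saves both answers, while run_interview(["Don't know"]) stops after the first question. Keeping the routing in a table rather than in code means that a change to the questionnaire does not require a change to the program.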
7. Techniques: using SAS to ask the questions. Decentralized CATI model or
internet model: SAS/IntrNet™, SAS/WebAF™ or Form Mailer™?
There are at least two other ways to perform a Computer Aided survey: the first, still by phone, simply
organizes the work of operators who work at home. The second is to ask people to answer directly via
the internet.
From the software point of view the first task may be done by connecting interviewers to a SAS/IntrNet
Application Server™ or to SAS/WebAF™. In both cases the operators will be supplied with the proper HTML.
In the former case the HTML will embed the proper broker parameters. In the latter case the HTML will be generated
with an embedded Java applet generated by SAS/AppDev STUDIO™.
The second task does not require any special apparatus: a person just visits a specific www site
on which the HTML survey forms are hosted, fills them in and then submits them. On the other
side there is a server where some mailers receive the form data and write them to a suitable
file (e.g. a txt file), and an interpreter (which may be a SAS parser program) will
translate the data into a table for subsequent statistical analysis.
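The interpreter step can be sketched as follows. We assume, purely for illustration, that the mailer writes one "name=value" pair per line; the actual layout depends on the mailer used, and the field names are hypothetical. Python stands in here for the SAS parser program mentioned above.

```python
import csv
import io

def parse_submission(text):
    """Turn one mailed form ('name=value' per line, an assumed layout) into a dict."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split("=", 1)
        record[name.strip()] = value.strip()
    return record

def to_table(records, columns):
    """Write the records as CSV text with a fixed column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

# Two made-up submissions; the second one leaves a question unanswered.
mails = ["agree=Agree\nadvantage=Advantages\n", "agree=Disagree\n"]
table = to_table([parse_submission(m) for m in mails], ["agree", "advantage"])
```

Unanswered questions become empty cells, so every submission yields one row with the same columns, which is the shape needed for the subsequent statistical analysis.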
8. Techniques: using SAS as a Mining Tool
The target variable is the perceived personal advantage, explained by the input variables
related to the inconveniences and problems caused by the organization of the Olympic Games.
The Enterprise Miner project starts from data definition, runs the regression model, builds the scores and then
shows the data.
Obviously this is only an example involving only a few observations, just for illustrative purposes.